Add skiplist optimization to auto_date_histogram aggregation #20057

asimmahmood1 · 2025-11-19T18:40:41Z

Description

This is built on top of #19573.

Compared to the date histogram skiplist, this change needs to hook into dynamic rounding during collection. There are two variables to keep track of:

bucketOrds - seen rounded dates
preparedRounding - starts at lowest interval: MINUTES and goes up

When a new ord is created, the increaseRoundingIfNeeded function is called to determine whether a new preparedRounding needs to kick in (e.g. from HOURS to DAYS); it may also merge dates in bucketOrds. Thus, both need to be supplied via lambda.
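To make the interaction between the two variables concrete, here is a minimal Python sketch of the idea. It is a toy model, not the OpenSearch implementation: the `AutoBuckets` class, the three-step interval ladder, and the "coarsen when the bucket count exceeds the target" rule are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical interval ladder in milliseconds: minute, hour, day.
ROUNDINGS_MS = [60_000, 3_600_000, 86_400_000]


class AutoBuckets:
    """Toy stand-in for bucketOrds plus preparedRounding (not the real classes)."""

    def __init__(self, target_buckets):
        self.target_buckets = target_buckets
        self.rounding_idx = 0      # plays the role of preparedRounding
        self.counts = Counter()    # plays the role of bucketOrds -> doc counts

    def round(self, ts_ms):
        # Round a non-negative timestamp down to the current interval.
        interval = ROUNDINGS_MS[self.rounding_idx]
        return ts_ms - ts_ms % interval

    def collect(self, ts_ms):
        key = self.round(ts_ms)
        is_new = key not in self.counts
        self.counts[key] += 1
        if is_new:
            self.increase_rounding_if_needed()

    def increase_rounding_if_needed(self):
        # Coarsen until the bucket count fits or the ladder is exhausted,
        # merging the already-seen keys under the new rounding.
        while (len(self.counts) > self.target_buckets
               and self.rounding_idx < len(ROUNDINGS_MS) - 1):
            self.rounding_idx += 1
            merged = Counter()
            for key, count in self.counts.items():
                merged[self.round(key)] += count
            self.counts = merged


buckets = AutoBuckets(target_buckets=2)
for ts in (0, 60_000, 120_000):
    buckets.collect(ts)
```

Because both `rounding_idx` and `counts` can be replaced in the middle of collection, any caller that captured them once up front would be reading stale state; reading them lazily (the lambda in this PR) avoids that.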

In the future, the skiplist can be enhanced to keep track of multiple owningBucketOrd values. For now it only works when the auto date histogram is the root (parent == null), or within a range filter rewrite context that guarantees a new auto date histogram is created per range.
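The eligibility rule described above can be written as a small predicate. This is only a sketch in Python: the function name and parameters are hypothetical, and the `field_has_skip_index` precondition is an assumption, not something stated in this PR.

```python
def can_use_skiplist(has_parent, in_filter_rewrite_context, field_has_skip_index=True):
    """Hypothetical eligibility check mirroring the rule in the description."""
    if not field_has_skip_index:
        # Assumption: without a skip index on the field there is nothing to use.
        return False
    # Root aggregation (parent == null), or one aggregator per range via filter rewrite.
    return (not has_parent) or in_filter_rewrite_context
```

Both accepted cases ensure a single owningBucketOrd per collector, which is what the current skiplist path relies on.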

Related Issues

Resolves #19827
Part of #18882
Also #19384

Check List

Functionality includes testing.
[n/a] API changes companion pull request created, if applicable.
[TODO] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

New Features
- Skiplist optimization for auto_date_histogram to speed up bucketing.
- Public helper to evaluate skiplist eligibility at runtime.
Improvements
- Dynamic rounding support in histogram aggregations to handle runtime interval changes.
- Performance gains and added debug counters when the skiplist path is used.
Tests
- New tests covering skiplist behavior, rounding-change scenarios, bucket merging, and sub-aggregation equivalence.

github-actions · 2025-11-19T18:55:36Z

❌ Gradle check result for 27f8248: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

asimmahmood1 · 2025-11-19T19:25:21Z

Functionally correct but not showing improvement

diff 17_big5_auto_date_filter_baseline.json 17_big5_auto_date_filter_candidate.json
2c2
<   "took": 114,
---
>   "took": 214,

Query

curl -X POST "http://localhost:9200/big5/_search"     -H "Content-Type: application/json"     -d '{
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "process.name": "systemd"
              }
            }
          ]
        }
      },
      "aggs": {
        "by_hour": {
          "auto_date_histogram": {
            "field": "@timestamp",
            "buckets": 3
          }
        }
      }
    }'

Result

{
  "took": 127,
  "timed_out": false,
  "terminated_early": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "by_hour": {
      "buckets": [
        {
          "key_as_string": "2023-01-01T00:00:00.000Z",
          "key": 1672531200000,
          "doc_count": 2488712
        },
        {
          "key_as_string": "2023-01-08T00:00:00.000Z",
          "key": 1673136000000,
          "doc_count": 824998
        }
      ],
      "interval": "7d"
    }
  }
}

asimmahmood1 · 2025-11-19T19:26:44Z

auto-date-agg-baseline.html
auto-date-agg-skiplist.html

jainankitk · 2025-11-19T21:39:39Z

@asimmahmood1 - Have you verified that the regression is only with AutoDateHistogramAggregator and not when using DateHistogramAggregator for the same query?

asimmahmood1 · 2025-11-24T17:36:04Z

So I figured out why performance was not up to par. The short answer is that when auto date moves on to larger intervals, we need to keep track of not just preparedRounding but also bucketOrds. If preparedRounding has changed since last time, we need to restart the skiplist logic. Otherwise we'll collect too many docs, and although the end result doesn't change (i.e. the unit tests pass), performance suffers.

Auto date histo has two modes: FromSingle and FromMany. FromSingle is often used in top-level aggregations, so it's similar to Date Histogram where parent is null. FromMany is used e.g. in big5's range-auto-date-histo, which would normally handle interleaving owningBucketOrd values. In the special case where the filter rewrite logic is used, we can safely assume that only one owningBucketOrd will be called per leaf collector, so we can use the skiplist histogram.
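A rough Python sketch of why a stale rounding hurts, assuming the skip index exposes a min and max value per block of docs (here each inner list plays the role of one block). The function and its parameters are hypothetical and only model the idea, not the aggregator code.

```python
def collect_blocks(blocks, get_interval_ms):
    """Return (bucket_counts, docs_visited_individually) for non-negative timestamps."""
    counts = {}
    visited = 0
    for block in blocks:
        # Re-read the interval for every block: it may have changed since the last one.
        interval = get_interval_ms()
        lo_key = min(block) - min(block) % interval
        hi_key = max(block) - max(block) % interval
        if lo_key == hi_key:
            # Whole block falls into one bucket: count it in bulk.
            counts[lo_key] = counts.get(lo_key, 0) + len(block)
        else:
            # Block straddles buckets: fall back to per-doc collection.
            for ts in block:
                key = ts - ts % interval
                counts[key] = counts.get(key, 0) + 1
                visited += 1
    return counts, visited


blocks = [[0, 60_000, 120_000], [180_000, 240_000]]
stale_counts, stale_visited = collect_blocks(blocks, lambda: 60_000)      # minute rounding
fresh_counts, fresh_visited = collect_blocks(blocks, lambda: 3_600_000)   # hour rounding
```

With the stale minute interval every doc is visited individually, while the hour interval lets both blocks be counted in bulk. The total doc count is the same either way, which matches the observation that the tests passed while the query got slower.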

See validation below.

Note: will update this PR after #19573 is merged.

asimmahmood1 · 2025-11-24T17:38:54Z

range auto date: took went from 1335 ms to 139 ms (89% lower)

11_range_auto_date_histo.sh
#!/bin/bash

curl -XGET 'http://localhost:9200/big5/_search' \
-H 'Content-Type: application/json' \
-d '{
  "size": 0, "profile": false,
  "aggs": {
    "tmax": {
      "range": {
        "field": "metrics.size",
        "ranges": [
          {
            "to": -10
          },
          {
            "from": -10,
            "to": 10
          },
          {
            "from": 10,
            "to": 100
          },
          {
            "from": 100,
            "to": 1000
          },
          {
            "from": 1000,
            "to": 2000
          },
          {
            "from": 2000
          }
        ]
      },
      "aggs": {
        "date": {
          "auto_date_histogram": {
            "field": "@timestamp",
            "buckets": 20
          }
        }
      }
    }
  }
}'

diff 11_range_auto_date_histo_candidate.json 11_range_auto_date_histo_baseline.json
2c2
<   "took": 139,
---
>   "took": 1335,

asimmahmood1 · 2025-11-24T17:48:10Z

range-auto-date-with-metrics (22% lower)

This is similar to date-with-metrics, since the time is bounded by the tavg stat.

[ec2-user@ip-172-31-61-197 ~]$ diff 11_range_auto_date_histo_with_metrics_candidate.json 11_range_auto_date_histo_with_metrics_baseline.json
2c2
<   "took": 2920,
---
>   "took": 3781,
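
For reference, the percentages quoted for the two range queries follow from the `took` values in the diffs. A small Python check (the helper name is just for illustration):

```python
def percent_reduction(baseline_ms, candidate_ms):
    """Relative drop in 'took' from baseline to candidate, in percent."""
    return round((baseline_ms - candidate_ms) / baseline_ms * 100, 1)


range_auto_date = percent_reduction(1335, 139)                 # 89.6
range_auto_date_with_metrics = percent_reduction(3781, 2920)   # 22.8
```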

github-actions · 2025-11-24T19:06:58Z

❌ Gradle check result for 4d7f22d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

asimmahmood1 · 2025-11-24T21:41:06Z

{"run-benchmark-test": "id_3"}

asimmahmood1 · 2025-11-24T21:41:11Z

{"run-benchmark-test": "id_11"}

github-actions · 2025-11-24T22:06:38Z

❌ Gradle check result for 4d7f22d:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-11-25T01:06:09Z

The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5174/ . Final results will be published once the job is completed.

opensearch-ci-bot · 2025-11-25T02:38:37Z

Benchmark Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5174/

Metric	Task	Value	Unit
Cumulative indexing time of primary shards		0	min
Min cumulative indexing time across primary shards		0	min
Median cumulative indexing time across primary shards		0	min
Max cumulative indexing time across primary shards		0	min
Cumulative indexing throttle time of primary shards		0	min
Min cumulative indexing throttle time across primary shards		0	min
Median cumulative indexing throttle time across primary shards		0	min
Max cumulative indexing throttle time across primary shards		0	min
Cumulative merge time of primary shards		0	min
Cumulative merge count of primary shards		0
Min cumulative merge time across primary shards		0	min
Median cumulative merge time across primary shards		0	min
Max cumulative merge time across primary shards		0	min
Cumulative merge throttle time of primary shards		0	min
Min cumulative merge throttle time across primary shards		0	min
Median cumulative merge throttle time across primary shards		0	min
Max cumulative merge throttle time across primary shards		0	min
Cumulative refresh time of primary shards		0	min
Cumulative refresh count of primary shards		31
Min cumulative refresh time across primary shards		0	min
Median cumulative refresh time across primary shards		0	min
Max cumulative refresh time across primary shards		0	min
Cumulative flush time of primary shards		0	min
Cumulative flush count of primary shards		8
Min cumulative flush time across primary shards		0	min
Median cumulative flush time across primary shards		0	min
Max cumulative flush time across primary shards		0	min
Total Young Gen GC time		2.18	s
Total Young Gen GC count		71
Total Old Gen GC time		0	s
Total Old Gen GC count		0
Store size		15.3221	GB
Translog size		4.09782e-07	GB
Heap used for segments		0	MB
Heap used for doc values		0	MB
Heap used for terms		0	MB
Heap used for norms		0	MB
Heap used for points		0	MB
Heap used for stored fields		0	MB
Segment count		73
100th percentile latency	wait-for-snapshot-recovery	300001	ms
100th percentile service time	wait-for-snapshot-recovery	300001	ms
error rate	wait-for-snapshot-recovery	100	%
Min Throughput	match-all	8	ops/s
Mean Throughput	match-all	8	ops/s
Median Throughput	match-all	8	ops/s
Max Throughput	match-all	8	ops/s
50th percentile latency	match-all	4.28224	ms
90th percentile latency	match-all	4.91128	ms
99th percentile latency	match-all	5.84384	ms
100th percentile latency	match-all	5.86495	ms
50th percentile service time	match-all	3.42397	ms
90th percentile service time	match-all	3.82077	ms
99th percentile service time	match-all	4.51325	ms
100th percentile service time	match-all	4.65042	ms
error rate	match-all	0	%
Min Throughput	term	49.9	ops/s
Mean Throughput	term	49.9	ops/s
Median Throughput	term	49.9	ops/s
Max Throughput	term	49.91	ops/s
50th percentile latency	term	3.64265	ms
90th percentile latency	term	4.15728	ms
99th percentile latency	term	9.31213	ms
100th percentile latency	term	14.2462	ms
50th percentile service time	term	2.90563	ms
90th percentile service time	term	3.16569	ms
99th percentile service time	term	8.3056	ms
100th percentile service time	term	13.2171	ms
error rate	term	0	%
Min Throughput	range	1	ops/s
Mean Throughput	range	1.01	ops/s
Median Throughput	range	1.01	ops/s
Max Throughput	range	1.01	ops/s
50th percentile latency	range	6.06911	ms
90th percentile latency	range	6.68034	ms
99th percentile latency	range	7.30484	ms
100th percentile latency	range	7.37473	ms
50th percentile service time	range	4.3927	ms
90th percentile service time	range	4.67829	ms
99th percentile service time	range	5.61881	ms
100th percentile service time	range	5.64028	ms
error rate	range	0	%
Min Throughput	200s-in-range	32.93	ops/s
Mean Throughput	200s-in-range	32.93	ops/s
Median Throughput	200s-in-range	32.93	ops/s
Max Throughput	200s-in-range	32.94	ops/s
50th percentile latency	200s-in-range	5.10912	ms
90th percentile latency	200s-in-range	6.15178	ms
99th percentile latency	200s-in-range	7.16864	ms
100th percentile latency	200s-in-range	7.41385	ms
50th percentile service time	200s-in-range	3.93731	ms
90th percentile service time	200s-in-range	4.25686	ms
99th percentile service time	200s-in-range	5.041	ms
100th percentile service time	200s-in-range	5.14181	ms
error rate	200s-in-range	0	%
Min Throughput	400s-in-range	50.03	ops/s
Mean Throughput	400s-in-range	50.03	ops/s
Median Throughput	400s-in-range	50.03	ops/s
Max Throughput	400s-in-range	50.03	ops/s
50th percentile latency	400s-in-range	3.43545	ms
90th percentile latency	400s-in-range	3.88271	ms
99th percentile latency	400s-in-range	9.29701	ms
100th percentile latency	400s-in-range	14.4099	ms
50th percentile service time	400s-in-range	2.62956	ms
90th percentile service time	400s-in-range	2.79619	ms
99th percentile service time	400s-in-range	8.37331	ms
100th percentile service time	400s-in-range	13.362	ms
error rate	400s-in-range	0	%
Min Throughput	hourly_agg	1.01	ops/s
Mean Throughput	hourly_agg	1.01	ops/s
Median Throughput	hourly_agg	1.01	ops/s
Max Throughput	hourly_agg	1.02	ops/s
50th percentile latency	hourly_agg	13.9039	ms
90th percentile latency	hourly_agg	15.0932	ms
99th percentile latency	hourly_agg	16.7868	ms
100th percentile latency	hourly_agg	17.49	ms
50th percentile service time	hourly_agg	12.0059	ms
90th percentile service time	hourly_agg	13.004	ms
99th percentile service time	hourly_agg	14.7653	ms
100th percentile service time	hourly_agg	15.154	ms
error rate	hourly_agg	0	%
Min Throughput	hourly_agg_with_filter	1	ops/s
Mean Throughput	hourly_agg_with_filter	1	ops/s
Median Throughput	hourly_agg_with_filter	1	ops/s
Max Throughput	hourly_agg_with_filter	1	ops/s
50th percentile latency	hourly_agg_with_filter	83.5865	ms
90th percentile latency	hourly_agg_with_filter	94.7305	ms
99th percentile latency	hourly_agg_with_filter	140.956	ms
100th percentile latency	hourly_agg_with_filter	183.002	ms
50th percentile service time	hourly_agg_with_filter	81.8279	ms
90th percentile service time	hourly_agg_with_filter	92.8934	ms
99th percentile service time	hourly_agg_with_filter	139.147	ms
100th percentile service time	hourly_agg_with_filter	181.211	ms
error rate	hourly_agg_with_filter	0	%
Min Throughput	hourly_agg_with_filter_and_metrics	0.24	ops/s
Mean Throughput	hourly_agg_with_filter_and_metrics	0.24	ops/s
Median Throughput	hourly_agg_with_filter_and_metrics	0.24	ops/s
Max Throughput	hourly_agg_with_filter_and_metrics	0.24	ops/s
50th percentile latency	hourly_agg_with_filter_and_metrics	323483	ms
90th percentile latency	hourly_agg_with_filter_and_metrics	451170	ms
99th percentile latency	hourly_agg_with_filter_and_metrics	479744	ms
100th percentile latency	hourly_agg_with_filter_and_metrics	481334	ms
50th percentile service time	hourly_agg_with_filter_and_metrics	4176.52	ms
90th percentile service time	hourly_agg_with_filter_and_metrics	4281.44	ms
99th percentile service time	hourly_agg_with_filter_and_metrics	4475.93	ms
100th percentile service time	hourly_agg_with_filter_and_metrics	4558.98	ms
error rate	hourly_agg_with_filter_and_metrics	0	%
Min Throughput	multi_term_agg	0.22	ops/s
Mean Throughput	multi_term_agg	0.22	ops/s
Median Throughput	multi_term_agg	0.22	ops/s
Max Throughput	multi_term_agg	0.23	ops/s
50th percentile latency	multi_term_agg	347090	ms
90th percentile latency	multi_term_agg	485132	ms
99th percentile latency	multi_term_agg	516024	ms
100th percentile latency	multi_term_agg	517783	ms
50th percentile service time	multi_term_agg	4506.97	ms
90th percentile service time	multi_term_agg	4626.37	ms
99th percentile service time	multi_term_agg	4866.29	ms
100th percentile service time	multi_term_agg	4986.95	ms
error rate	multi_term_agg	0	%
Min Throughput	scroll	25.04	pages/s
Mean Throughput	scroll	25.07	pages/s
Median Throughput	scroll	25.07	pages/s
Max Throughput	scroll	25.13	pages/s
50th percentile latency	scroll	207.343	ms
90th percentile latency	scroll	210.843	ms
99th percentile latency	scroll	264.671	ms
100th percentile latency	scroll	291.279	ms
50th percentile service time	scroll	205.387	ms
90th percentile service time	scroll	208.758	ms
99th percentile service time	scroll	262.554	ms
100th percentile service time	scroll	289.84	ms
error rate	scroll	0	%
Min Throughput	desc_sort_size	1	ops/s
Mean Throughput	desc_sort_size	1	ops/s
Median Throughput	desc_sort_size	1	ops/s
Max Throughput	desc_sort_size	1	ops/s
50th percentile latency	desc_sort_size	7.20233	ms
90th percentile latency	desc_sort_size	7.99693	ms
99th percentile latency	desc_sort_size	8.81847	ms
100th percentile latency	desc_sort_size	8.96338	ms
50th percentile service time	desc_sort_size	5.40942	ms
90th percentile service time	desc_sort_size	5.89622	ms
99th percentile service time	desc_sort_size	6.66956	ms
100th percentile service time	desc_sort_size	6.68186	ms
error rate	desc_sort_size	0	%
Min Throughput	asc_sort_size	1	ops/s
Mean Throughput	asc_sort_size	1	ops/s
Median Throughput	asc_sort_size	1	ops/s
Max Throughput	asc_sort_size	1	ops/s
50th percentile latency	asc_sort_size	8.26643	ms
90th percentile latency	asc_sort_size	8.92356	ms
99th percentile latency	asc_sort_size	9.54172	ms
100th percentile latency	asc_sort_size	9.60887	ms
50th percentile service time	asc_sort_size	6.35152	ms
90th percentile service time	asc_sort_size	7.0678	ms
99th percentile service time	asc_sort_size	7.44685	ms
100th percentile service time	asc_sort_size	7.53512	ms
error rate	asc_sort_size	0	%
Min Throughput	desc_sort_timestamp	1	ops/s
Mean Throughput	desc_sort_timestamp	1	ops/s
Median Throughput	desc_sort_timestamp	1	ops/s
Max Throughput	desc_sort_timestamp	1	ops/s
50th percentile latency	desc_sort_timestamp	13.5088	ms
90th percentile latency	desc_sort_timestamp	14.2314	ms
99th percentile latency	desc_sort_timestamp	15.988	ms
100th percentile latency	desc_sort_timestamp	16.0292	ms
50th percentile service time	desc_sort_timestamp	11.8524	ms
90th percentile service time	desc_sort_timestamp	12.2587	ms
99th percentile service time	desc_sort_timestamp	14.5756	ms
100th percentile service time	desc_sort_timestamp	14.6071	ms
error rate	desc_sort_timestamp	0	%
Min Throughput	asc_sort_timestamp	1	ops/s
Mean Throughput	asc_sort_timestamp	1	ops/s
Median Throughput	asc_sort_timestamp	1	ops/s
Max Throughput	asc_sort_timestamp	1	ops/s
50th percentile latency	asc_sort_timestamp	8.08182	ms
90th percentile latency	asc_sort_timestamp	8.76246	ms
99th percentile latency	asc_sort_timestamp	9.50709	ms
100th percentile latency	asc_sort_timestamp	10.022	ms
50th percentile service time	asc_sort_timestamp	6.20787	ms
90th percentile service time	asc_sort_timestamp	6.69686	ms
99th percentile service time	asc_sort_timestamp	7.68305	ms
100th percentile service time	asc_sort_timestamp	8.12289	ms
error rate	asc_sort_timestamp	0	%
Min Throughput	desc_sort_with_after_timestamp	1.01	ops/s
Mean Throughput	desc_sort_with_after_timestamp	1.02	ops/s
Median Throughput	desc_sort_with_after_timestamp	1.02	ops/s
Max Throughput	desc_sort_with_after_timestamp	1.1	ops/s
50th percentile latency	desc_sort_with_after_timestamp	6.27938	ms
90th percentile latency	desc_sort_with_after_timestamp	6.80572	ms
99th percentile latency	desc_sort_with_after_timestamp	7.46574	ms
100th percentile latency	desc_sort_with_after_timestamp	7.57326	ms
50th percentile service time	desc_sort_with_after_timestamp	4.42068	ms
90th percentile service time	desc_sort_with_after_timestamp	4.76673	ms
99th percentile service time	desc_sort_with_after_timestamp	5.54311	ms
100th percentile service time	desc_sort_with_after_timestamp	5.58049	ms
error rate	desc_sort_with_after_timestamp	0	%
Min Throughput	asc_sort_with_after_timestamp	1.01	ops/s
Mean Throughput	asc_sort_with_after_timestamp	1.02	ops/s
Median Throughput	asc_sort_with_after_timestamp	1.02	ops/s
Max Throughput	asc_sort_with_after_timestamp	1.1	ops/s
50th percentile latency	asc_sort_with_after_timestamp	5.39147	ms
90th percentile latency	asc_sort_with_after_timestamp	5.85555	ms
99th percentile latency	asc_sort_with_after_timestamp	6.20304	ms
100th percentile latency	asc_sort_with_after_timestamp	6.30199	ms
50th percentile service time	asc_sort_with_after_timestamp	3.60652	ms
90th percentile service time	asc_sort_with_after_timestamp	3.74439	ms
99th percentile service time	asc_sort_with_after_timestamp	3.92286	ms
100th percentile service time	asc_sort_with_after_timestamp	4.0068	ms
error rate	asc_sort_with_after_timestamp	0	%
Min Throughput	range_size	2.01	ops/s
Mean Throughput	range_size	2.01	ops/s
Median Throughput	range_size	2.01	ops/s
Max Throughput	range_size	2.02	ops/s
50th percentile latency	range_size	8.27167	ms
90th percentile latency	range_size	8.83461	ms
99th percentile latency	range_size	10.0267	ms
100th percentile latency	range_size	10.1073	ms
50th percentile service time	range_size	6.9538	ms
90th percentile service time	range_size	7.30531	ms
99th percentile service time	range_size	8.47684	ms
100th percentile service time	range_size	8.55189	ms
error rate	range_size	0	%
Min Throughput	range_with_asc_sort	2.01	ops/s
Mean Throughput	range_with_asc_sort	2.01	ops/s
Median Throughput	range_with_asc_sort	2.01	ops/s
Max Throughput	range_with_asc_sort	2.02	ops/s
50th percentile latency	range_with_asc_sort	18.8966	ms
90th percentile latency	range_with_asc_sort	20.8522	ms
99th percentile latency	range_with_asc_sort	22.2688	ms
100th percentile latency	range_with_asc_sort	22.4174	ms
50th percentile service time	range_with_asc_sort	17.4148	ms
90th percentile service time	range_with_asc_sort	19.218	ms
99th percentile service time	range_with_asc_sort	20.434	ms
100th percentile service time	range_with_asc_sort	20.5176	ms
error rate	range_with_asc_sort	0	%
Min Throughput	range_with_desc_sort	2.01	ops/s
Mean Throughput	range_with_desc_sort	2.01	ops/s
Median Throughput	range_with_desc_sort	2.01	ops/s
Max Throughput	range_with_desc_sort	2.02	ops/s
50th percentile latency	range_with_desc_sort	20.7926	ms
90th percentile latency	range_with_desc_sort	24.532	ms
99th percentile latency	range_with_desc_sort	33.4369	ms
100th percentile latency	range_with_desc_sort	41.1489	ms
50th percentile service time	range_with_desc_sort	18.6428	ms
90th percentile service time	range_with_desc_sort	22.7379	ms
99th percentile service time	range_with_desc_sort	31.1639	ms
100th percentile service time	range_with_desc_sort	38.9406	ms
error rate	range_with_desc_sort	0	%

opensearch-ci-bot · 2025-11-25T02:40:32Z

Benchmark Baseline Comparison Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/210/

Metric	Task	Baseline	Contender	Diff	Unit
Cumulative indexing time of primary shards		0	0	0	min
Min cumulative indexing time across primary shard		0	0	0	min
Median cumulative indexing time across primary shard		0	0	0	min
Max cumulative indexing time across primary shard		0	0	0	min
Cumulative indexing throttle time of primary shards		0	0	0	min
Min cumulative indexing throttle time across primary shard		0	0	0	min
Median cumulative indexing throttle time across primary shard		0	0	0	min
Max cumulative indexing throttle time across primary shard		0	0	0	min
Cumulative merge time of primary shards		0	0	0	min
Cumulative merge count of primary shards		0	0	0
Min cumulative merge time across primary shard		0	0	0	min
Median cumulative merge time across primary shard		0	0	0	min
Max cumulative merge time across primary shard		0	0	0	min
Cumulative merge throttle time of primary shards		0	0	0	min
Min cumulative merge throttle time across primary shard		0	0	0	min
Median cumulative merge throttle time across primary shard		0	0	0	min
Max cumulative merge throttle time across primary shard		0	0	0	min
Cumulative refresh time of primary shards		0	0	0	min
Cumulative refresh count of primary shards		31	31	0
Min cumulative refresh time across primary shard		0	0	0	min
Median cumulative refresh time across primary shard		0	0	0	min
Max cumulative refresh time across primary shard		0	0	0	min
Cumulative flush time of primary shards		0	0	0	min
Cumulative flush count of primary shards		8	8	0
Min cumulative flush time across primary shard		0	0	0	min
Median cumulative flush time across primary shard		0	0	0	min
Max cumulative flush time across primary shard		0	0	0	min
Total Young Gen GC time		2.16	2.18	0.02	s
Total Young Gen GC count		71	71	0
Total Old Gen GC time		0	0	0	s
Total Old Gen GC count		0	0	0
Store size		15.3221	15.3221	0	GB
Translog size		4.09782e-07	4.09782e-07	0	GB
Heap used for segments		0	0	0	MB
Heap used for doc values		0	0	0	MB
Heap used for terms		0	0	0	MB
Heap used for norms		0	0	0	MB
Heap used for points		0	0	0	MB
Heap used for stored fields		0	0	0	MB
Segment count		73	73	0
100th percentile latency	wait-for-snapshot-recovery	300002	300001	-0.46875	ms
100th percentile service time	wait-for-snapshot-recovery	300002	300001	-0.46875	ms
error rate	wait-for-snapshot-recovery	100	100	0	%
Min Throughput	match-all	8.00004	7.99863	-0.00142	ops/s
Mean Throughput	match-all	8.0001	7.99876	-0.00134	ops/s
Median Throughput	match-all	8.00011	7.99878	-0.00133	ops/s
Max Throughput	match-all	8.00013	7.99891	-0.00122	ops/s
50th percentile latency	match-all	4.08353	4.28224	0.1987	ms
90th percentile latency	match-all	4.69237	4.91128	0.21891	ms
99th percentile latency	match-all	5.07276	5.84384	0.77108	ms
100th percentile latency	match-all	5.15329	5.86495	0.71166	ms
50th percentile service time	match-all	3.09253	3.42397	0.33145	ms
90th percentile service time	match-all	3.56093	3.82077	0.25984	ms
99th percentile service time	match-all	4.31898	4.51325	0.19427	ms
100th percentile service time	match-all	4.38225	4.65042	0.26817	ms
error rate	match-all	0	0	0	%
Min Throughput	term	49.8653	49.898	0.03277	ops/s
Mean Throughput	term	49.8705	49.9015	0.03104	ops/s
Median Throughput	term	49.8705	49.9015	0.03104	ops/s
Max Throughput	term	49.8757	49.905	0.02931	ops/s
50th percentile latency	term	3.45205	3.64265	0.1906	ms
90th percentile latency	term	3.89325	4.15728	0.26403	ms
99th percentile latency	term	8.97703	9.31213	0.3351	ms
100th percentile latency	term	13.9687	14.2462	0.27751	ms
50th percentile service time	term	2.65867	2.90563	0.24697	ms
90th percentile service time	term	2.84279	3.16569	0.32289	ms
99th percentile service time	term	3.18447	8.3056	5.12113	ms
100th percentile service time	term	3.20051	13.2171	10.0166	ms
error rate	term	0	0	0	%
Min Throughput	range	1.00478	1.00465	-0.00013	ops/s
Mean Throughput	range	1.00662	1.00644	-0.00018	ops/s
Median Throughput	range	1.00636	1.00619	-0.00018	ops/s
Max Throughput	range	1.00951	1.00925	-0.00026	ops/s
50th percentile latency	range	6.32679	6.06911	-0.25767	ms
90th percentile latency	range	6.74759	6.68034	-0.06725	ms
99th percentile latency	range	14.9304	7.30484	-7.62552	ms
100th percentile latency	range	22.2679	7.37473	-14.8932	ms
50th percentile service time	range	4.40336	4.3927	-0.01066	ms
90th percentile service time	range	4.713	4.67829	-0.03471	ms
99th percentile service time	range	13.1272	5.61881	-7.50838	ms
100th percentile service time	range	20.2359	5.64028	-14.5956	ms
error rate	range	0	0	0	%
Min Throughput	200s-in-range	32.9022	32.9316	0.02934	ops/s
Mean Throughput	200s-in-range	32.9078	32.9337	0.02591	ops/s
Median Throughput	200s-in-range	32.9084	32.9323	0.0239	ops/s
Max Throughput	200s-in-range	32.9129	32.9374	0.02448	ops/s
50th percentile latency	200s-in-range	4.86638	5.10912	0.24274	ms
90th percentile latency	200s-in-range	5.66428	6.15178	0.4875	ms
99th percentile latency	200s-in-range	6.22094	7.16864	0.9477	ms
100th percentile latency	200s-in-range	6.60392	7.41385	0.80994	ms
50th percentile service time	200s-in-range	3.52999	3.93731	0.40733	ms
90th percentile service time	200s-in-range	3.69637	4.25686	0.5605	ms
99th percentile service time	200s-in-range	4.96872	5.041	0.07228	ms
100th percentile service time	200s-in-range	5.87261	5.14181	-0.7308	ms
error rate	200s-in-range	0	0	0	%
Min Throughput	400s-in-range	50.0106	50.033	0.02244	ops/s
Mean Throughput	400s-in-range	50.0119	50.034	0.02205	ops/s
Median Throughput	400s-in-range	50.0119	50.034	0.02205	ops/s
Max Throughput	400s-in-range	50.0132	50.0349	0.02166	ops/s
50th percentile latency	400s-in-range	3.69972	3.43545	-0.26427	ms
90th percentile latency	400s-in-range	4.12578	3.88271	-0.24306	ms
99th percentile latency	400s-in-range	9.65137	9.29701	-0.35436	ms
100th percentile latency	400s-in-range	14.8397	14.4099	-0.42975	ms
50th percentile service time	400s-in-range	2.91395	2.62956	-0.28439	ms
90th percentile service time	400s-in-range	3.02216	2.79619	-0.22597	ms
99th percentile service time	400s-in-range	8.69587	8.37331	-0.32257	ms
100th percentile service time	400s-in-range	13.9622	13.362	-0.60017	ms
error rate	400s-in-range	0	0	0	%
Min Throughput	hourly_agg	1.00566	1.00571	5e-05	ops/s
Mean Throughput	hourly_agg	1.00932	1.0094	8e-05	ops/s
Median Throughput	hourly_agg	1.00848	1.00855	7e-05	ops/s
Max Throughput	hourly_agg	1.01684	1.01699	0.00016	ops/s
50th percentile latency	hourly_agg	13.2876	13.9039	0.61626	ms
90th percentile latency	hourly_agg	14.3772	15.0932	0.71602	ms
99th percentile latency	hourly_agg	16.4744	16.7868	0.31244	ms
100th percentile latency	hourly_agg	16.7372	17.49	0.75281	ms
50th percentile service time	hourly_agg	11.4508	12.0059	0.55507	ms
90th percentile service time	hourly_agg	12.4074	13.004	0.59659	ms
99th percentile service time	hourly_agg	14.3855	14.7653	0.37978	ms
100th percentile service time	hourly_agg	14.8684	15.154	0.28559	ms
error rate	hourly_agg	0	0	0	%
Min Throughput	hourly_agg_with_filter	1.00298	1.00122	-0.00176	ops/s
Mean Throughput	hourly_agg_with_filter	1.00488	1.002	-0.00288	ops/s
Median Throughput	hourly_agg_with_filter	1.00445	1.00182	-0.00262	ops/s
Max Throughput	hourly_agg_with_filter	1.00879	1.00361	-0.00518	ops/s
50th percentile latency	hourly_agg_with_filter	81.6363	83.5865	1.95017	ms
90th percentile latency	hourly_agg_with_filter	92.412	94.7305	2.31843	ms
99th percentile latency	hourly_agg_with_filter	127.853	140.956	13.1029	ms
100th percentile latency	hourly_agg_with_filter	160.239	183.002	22.7638	ms
50th percentile service time	hourly_agg_with_filter	79.6116	81.8279	2.21627	ms
90th percentile service time	hourly_agg_with_filter	90.3268	92.8934	2.56658	ms
99th percentile service time	hourly_agg_with_filter	126	139.147	13.1461	ms
100th percentile service time	hourly_agg_with_filter	158.385	181.211	22.8264	ms
error rate	hourly_agg_with_filter	0	0	0	%
Min Throughput	hourly_agg_with_filter_and_metrics	0.216111	0.235085	0.01897	ops/s
Mean Throughput	hourly_agg_with_filter_and_metrics	0.216971	0.236491	0.01952	ops/s
Median Throughput	hourly_agg_with_filter_and_metrics	0.216991	0.236566	0.01958	ops/s
Max Throughput	hourly_agg_with_filter_and_metrics	0.217794	0.237297	0.0195	ops/s
50th percentile latency	hourly_agg_with_filter_and_metrics	363245	323483	-39762.4	ms
90th percentile latency	hourly_agg_with_filter_and_metrics	505752	451170	-54581.8	ms
99th percentile latency	hourly_agg_with_filter_and_metrics	538132	479744	-58388.8	ms
100th percentile latency	hourly_agg_with_filter_and_metrics	539921	481334	-58586.9	ms
50th percentile service time	hourly_agg_with_filter_and_metrics	4576.64	4176.52	-400.121	ms
90th percentile service time	hourly_agg_with_filter_and_metrics	4684.03	4281.44	-402.596	ms
99th percentile service time	hourly_agg_with_filter_and_metrics	4785.3	4475.93	-309.373	ms
100th percentile service time	hourly_agg_with_filter_and_metrics	4817.81	4558.98	-258.823	ms
error rate	hourly_agg_with_filter_and_metrics	0	0	0	%
Min Throughput	multi_term_agg	0.220417	0.224183	0.00377	ops/s
Mean Throughput	multi_term_agg	0.222457	0.224716	0.00226	ops/s
Median Throughput	multi_term_agg	0.222732	0.224588	0.00186	ops/s
Max Throughput	multi_term_agg	0.223353	0.226172	0.00282	ops/s
50th percentile latency	multi_term_agg	350647	347090	-3557.05	ms
90th percentile latency	multi_term_agg	489341	485132	-4208.94	ms
99th percentile latency	multi_term_agg	520187	516024	-4163.12	ms
100th percentile latency	multi_term_agg	521879	517783	-4096.09	ms
50th percentile service time	multi_term_agg	4483.31	4506.97	23.6614	ms
90th percentile service time	multi_term_agg	4644.73	4626.37	-18.353	ms
99th percentile service time	multi_term_agg	4690.78	4866.29	175.507	ms
100th percentile service time	multi_term_agg	4711.89	4986.95	275.058	ms
error rate	multi_term_agg	0	0	0	%
Min Throughput	scroll	25.0498	25.0438	-0.00595	pages/s
Mean Throughput	scroll	25.0819	25.0721	-0.00981	pages/s
Median Throughput	scroll	25.0745	25.0656	-0.0089	pages/s
Max Throughput	scroll	25.1485	25.1306	-0.0179	pages/s
50th percentile latency	scroll	209.486	207.343	-2.14297	ms
90th percentile latency	scroll	214.075	210.843	-3.23216	ms
99th percentile latency	scroll	260.846	264.671	3.82584	ms
100th percentile latency	scroll	283.995	291.279	7.28326	ms
50th percentile service time	scroll	207.623	205.387	-2.23612	ms
90th percentile service time	scroll	211.888	208.758	-3.12957	ms
99th percentile service time	scroll	258.878	262.554	3.67609	ms
100th percentile service time	scroll	281.69	289.84	8.14999	ms
error rate	scroll	0	0	0	%
Min Throughput	desc_sort_size	1.00319	1.0032	1e-05	ops/s
Mean Throughput	desc_sort_size	1.00388	1.00389	1e-05	ops/s
Median Throughput	desc_sort_size	1.00383	1.00384	1e-05	ops/s
Max Throughput	desc_sort_size	1.00478	1.00479	1e-05	ops/s
50th percentile latency	desc_sort_size	7.67193	7.20233	-0.4696	ms
90th percentile latency	desc_sort_size	8.26067	7.99693	-0.26374	ms
99th percentile latency	desc_sort_size	9.19009	8.81847	-0.37162	ms
100th percentile latency	desc_sort_size	9.25557	8.96338	-0.29219	ms
50th percentile service time	desc_sort_size	5.80691	5.40942	-0.39748	ms
90th percentile service time	desc_sort_size	6.34789	5.89622	-0.45167	ms
99th percentile service time	desc_sort_size	7.17157	6.66956	-0.502	ms
100th percentile service time	desc_sort_size	7.36431	6.68186	-0.68245	ms
error rate	desc_sort_size	0	0	0	%
Min Throughput	asc_sort_size	1.0032	1.00323	3e-05	ops/s
Mean Throughput	asc_sort_size	1.00389	1.00392	3e-05	ops/s
Median Throughput	asc_sort_size	1.00384	1.00387	3e-05	ops/s
Max Throughput	asc_sort_size	1.00479	1.00483	4e-05	ops/s
50th percentile latency	asc_sort_size	8.39235	8.26643	-0.12592	ms
90th percentile latency	asc_sort_size	9.23733	8.92356	-0.31377	ms
99th percentile latency	asc_sort_size	9.98721	9.54172	-0.4455	ms
100th percentile latency	asc_sort_size	9.99685	9.60887	-0.38799	ms
50th percentile service time	asc_sort_size	6.65887	6.35152	-0.30736	ms
90th percentile service time	asc_sort_size	7.37027	7.0678	-0.30247	ms
99th percentile service time	asc_sort_size	8.08429	7.44685	-0.63744	ms
100th percentile service time	asc_sort_size	8.30731	7.53512	-0.77219	ms
error rate	asc_sort_size	0	0	0	%
Min Throughput	desc_sort_timestamp	1.00316	1.00312	-4e-05	ops/s
Mean Throughput	desc_sort_timestamp	1.00384	1.00379	-5e-05	ops/s
Median Throughput	desc_sort_timestamp	1.00378	1.00374	-5e-05	ops/s
Max Throughput	desc_sort_timestamp	1.00472	1.00466	-6e-05	ops/s
50th percentile latency	desc_sort_timestamp	13.7684	13.5088	-0.25965	ms
90th percentile latency	desc_sort_timestamp	14.6248	14.2314	-0.39336	ms
99th percentile latency	desc_sort_timestamp	16.3723	15.988	-0.3843	ms
100th percentile latency	desc_sort_timestamp	16.813	16.0292	-0.78384	ms
50th percentile service time	desc_sort_timestamp	12.0042	11.8524	-0.15178	ms
90th percentile service time	desc_sort_timestamp	12.5057	12.2587	-0.24695	ms
99th percentile service time	desc_sort_timestamp	14.3426	14.5756	0.233	ms
100th percentile service time	desc_sort_timestamp	14.6647	14.6071	-0.05765	ms
error rate	desc_sort_timestamp	0	0	0	%
Min Throughput	asc_sort_timestamp	1.00327	1.00328	0	ops/s
Mean Throughput	asc_sort_timestamp	1.00398	1.00398	0	ops/s
Median Throughput	asc_sort_timestamp	1.00392	1.00393	0	ops/s
Max Throughput	asc_sort_timestamp	1.00489	1.0049	1e-05	ops/s
50th percentile latency	asc_sort_timestamp	7.98311	8.08182	0.09871	ms
90th percentile latency	asc_sort_timestamp	8.56813	8.76246	0.19433	ms
99th percentile latency	asc_sort_timestamp	9.33831	9.50709	0.16878	ms
100th percentile latency	asc_sort_timestamp	9.5369	10.022	0.48514	ms
50th percentile service time	asc_sort_timestamp	5.97092	6.20787	0.23695	ms
90th percentile service time	asc_sort_timestamp	6.56714	6.69686	0.12972	ms
99th percentile service time	asc_sort_timestamp	7.26511	7.68305	0.41794	ms
100th percentile service time	asc_sort_timestamp	7.36109	8.12289	0.7618	ms
error rate	asc_sort_timestamp	0	0	0	%
Min Throughput	desc_sort_with_after_timestamp	1.00902	1.00899	-3e-05	ops/s
Mean Throughput	desc_sort_with_after_timestamp	1.02402	1.02394	-8e-05	ops/s
Median Throughput	desc_sort_with_after_timestamp	1.01652	1.01647	-5e-05	ops/s
Max Throughput	desc_sort_with_after_timestamp	1.09819	1.09782	-0.00036	ops/s
50th percentile latency	desc_sort_with_after_timestamp	5.98317	6.27938	0.29621	ms
90th percentile latency	desc_sort_with_after_timestamp	6.5393	6.80572	0.26642	ms
99th percentile latency	desc_sort_with_after_timestamp	6.8667	7.46574	0.59904	ms
100th percentile latency	desc_sort_with_after_timestamp	6.88107	7.57326	0.69219	ms
50th percentile service time	desc_sort_with_after_timestamp	4.20681	4.42068	0.21387	ms
90th percentile service time	desc_sort_with_after_timestamp	4.5541	4.76673	0.21263	ms
99th percentile service time	desc_sort_with_after_timestamp	5.08222	5.54311	0.46089	ms
100th percentile service time	desc_sort_with_after_timestamp	5.23009	5.58049	0.35039	ms
error rate	desc_sort_with_after_timestamp	0	0	0	%
Min Throughput	asc_sort_with_after_timestamp	1.00906	1.00906	-0	ops/s
Mean Throughput	asc_sort_with_after_timestamp	1.02412	1.02411	-0	ops/s
Median Throughput	asc_sort_with_after_timestamp	1.01659	1.01659	-0	ops/s
Max Throughput	asc_sort_with_after_timestamp	1.09864	1.09855	-9e-05	ops/s
50th percentile latency	asc_sort_with_after_timestamp	5.53987	5.39147	-0.1484	ms
90th percentile latency	asc_sort_with_after_timestamp	5.91514	5.85555	-0.05959	ms
99th percentile latency	asc_sort_with_after_timestamp	6.20781	6.20304	-0.00477	ms
100th percentile latency	asc_sort_with_after_timestamp	6.20875	6.30199	0.09324	ms
50th percentile service time	asc_sort_with_after_timestamp	3.65936	3.60652	-0.05284	ms
90th percentile service time	asc_sort_with_after_timestamp	3.81855	3.74439	-0.07415	ms
99th percentile service time	asc_sort_with_after_timestamp	3.94172	3.92286	-0.01886	ms
100th percentile service time	asc_sort_with_after_timestamp	3.94465	4.0068	0.06214	ms
error rate	asc_sort_with_after_timestamp	0	0	0	%
Min Throughput	range_size	2.00953	2.00955	1e-05	ops/s
Mean Throughput	range_size	2.01319	2.0132	1e-05	ops/s
Median Throughput	range_size	2.01268	2.01269	1e-05	ops/s
Max Throughput	range_size	2.01888	2.0189	3e-05	ops/s
50th percentile latency	range_size	8.52367	8.27167	-0.25199	ms
90th percentile latency	range_size	9.10341	8.83461	-0.2688	ms
99th percentile latency	range_size	9.8454	10.0267	0.18131	ms
100th percentile latency	range_size	10.0394	10.1073	0.06786	ms
50th percentile service time	range_size	7.26068	6.9538	-0.30688	ms
90th percentile service time	range_size	7.49863	7.30531	-0.19332	ms
99th percentile service time	range_size	8.3443	8.47684	0.13253	ms
100th percentile service time	range_size	8.40524	8.55189	0.14665	ms
error rate	range_size	0	0	0	%
Min Throughput	range_with_asc_sort	2.00853	2.00832	-0.00022	ops/s
Mean Throughput	range_with_asc_sort	2.01181	2.01152	-0.00029	ops/s
Median Throughput	range_with_asc_sort	2.01135	2.01108	-0.00027	ops/s
Max Throughput	range_with_asc_sort	2.01693	2.0165	-0.00043	ops/s
50th percentile latency	range_with_asc_sort	19.3013	18.8966	-0.40474	ms
90th percentile latency	range_with_asc_sort	21.44	20.8522	-0.5878	ms
99th percentile latency	range_with_asc_sort	22.3658	22.2688	-0.097	ms
100th percentile latency	range_with_asc_sort	22.408	22.4174	0.00942	ms
50th percentile service time	range_with_asc_sort	17.5966	17.4148	-0.18179	ms
90th percentile service time	range_with_asc_sort	19.992	19.218	-0.77396	ms
99th percentile service time	range_with_asc_sort	20.5763	20.434	-0.14236	ms
100th percentile service time	range_with_asc_sort	20.6325	20.5176	-0.11491	ms
error rate	range_with_asc_sort	0	0	0	%
Min Throughput	range_with_desc_sort	2.0093	2.00933	3e-05	ops/s
Mean Throughput	range_with_desc_sort	2.01286	2.0129	4e-05	ops/s
Median Throughput	range_with_desc_sort	2.01237	2.01242	6e-05	ops/s
Max Throughput	range_with_desc_sort	2.01843	2.01852	9e-05	ops/s
50th percentile latency	range_with_desc_sort	20.9821	20.7926	-0.18952	ms
90th percentile latency	range_with_desc_sort	24.4813	24.532	0.0507	ms
99th percentile latency	range_with_desc_sort	25.4849	33.4369	7.95205	ms
100th percentile latency	range_with_desc_sort	25.504	41.1489	15.6449	ms
50th percentile service time	range_with_desc_sort	18.6587	18.6428	-0.01594	ms
90th percentile service time	range_with_desc_sort	22.2279	22.7379	0.51001	ms
99th percentile service time	range_with_desc_sort	23.3495	31.1639	7.81438	ms
100th percentile service time	range_with_desc_sort	23.4663	38.9406	15.4744	ms
error rate	range_with_desc_sort	0	0	0	%
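
To read the rows above as relative changes rather than absolute differences, here is a minimal sketch. It assumes the two numeric columns are baseline then contender, and that the diff column is contender minus baseline (consistent with the rows, e.g. 4176.52 - 4576.64 = -400.12). The class and method names are hypothetical and are not part of the benchmark tooling.

```java
import java.util.Locale;

public class RelativeChange {
    /** Percentage change of contender relative to baseline; negative means the contender value is lower. */
    static double percentChange(double baseline, double contender) {
        return (contender - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Values copied from the table rows above; column meaning is an assumption (baseline, contender).
        double hourlyP50 = percentChange(4576.64, 4176.52);   // hourly_agg_with_filter_and_metrics, 50th percentile service time
        double descSortP99 = percentChange(25.4849, 33.4369); // range_with_desc_sort, 99th percentile latency

        System.out.printf(Locale.ROOT, "hourly_agg_with_filter_and_metrics p50 service time: %.1f%%%n", hourlyP50);
        System.out.printf(Locale.ROOT, "range_with_desc_sort p99 latency: %.1f%%%n", descSortP99);
    }
}
```

Under that column assumption, the first row works out to roughly -8.7% and the second to roughly +31.2%, which makes the tail-latency outlier in range_with_desc_sort easier to spot than the absolute millisecond diff does.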

Signed-off-by: Asim Mahmood <asim.seng@gmail.com>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (2)
1052-1052: Unused field may be orphaned code.

The filterField is added to every document but is never used in any query or assertion in the tests. Consider either:

Removing it if it's not needed

Adding a comment explaining its intended future use

1057-1063: Consider expanding the comment for clarity.

The comment briefly states "delete a doc to avoid approx optimization" but doesn't explain the mechanism. A slightly more descriptive comment would help future maintainers understand why deleted documents prevent the approximation path.
-        // delete a doc to avoid approx optimization
+        // Add and delete a document to create deleted docs in the segment.
+        // This prevents the filter rewrite approximation optimization from being used,
+        // ensuring the skiplist collector path is exercised during collection.
         if (enableSkiplist) {

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 44bbcb3 and f2ff706.

📒 Files selected for processing (2)

CHANGELOG.md (1 hunks)
server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (5 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

CHANGELOG.md

🧰 Additional context used

🧬 Code graph analysis (1)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (1)

server/src/main/java/org/opensearch/index/mapper/NumberFieldMapper.java (1)

NumberFieldMapper (102-2185)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)

GitHub Check: gradle-check
GitHub Check: precommit (21, ubuntu-24.04-arm)
GitHub Check: Analyze (java)
GitHub Check: precommit (25, ubuntu-latest)
GitHub Check: precommit (25, ubuntu-24.04-arm)
GitHub Check: precommit (21, windows-latest)
GitHub Check: precommit (21, windows-2025, true)
GitHub Check: precommit (21, ubuntu-latest)
GitHub Check: precommit (25, macos-15)
GitHub Check: precommit (21, macos-15-intel)
GitHub Check: precommit (25, windows-latest)
GitHub Check: precommit (25, macos-15-intel)
GitHub Check: precommit (21, macos-15)
GitHub Check: assemble (25, ubuntu-latest)
GitHub Check: assemble (21, ubuntu-latest)
GitHub Check: assemble (25, windows-latest)
GitHub Check: assemble (21, ubuntu-24.04-arm)
GitHub Check: assemble (25, ubuntu-24.04-arm)
GitHub Check: detect-breaking-change
GitHub Check: assemble (21, windows-latest)
GitHub Check: Mend Security Check

🔇 Additional comments (8)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (8)

36-51: LGTM!

Imports are properly added and all are used in the new test infrastructure.

102-103: LGTM!

The comment addresses the previously requested explanation for using @timestamp. This aligns with OpenSearch conventions where @timestamp is commonly used for time-series data.

994-1003: LGTM!

Clean backward-compatible overload pattern that maintains default behavior while allowing tests to opt into the skiplist path.

1094-1130: LGTM!

Well-structured test that creates a dataset spanning both hourly and monthly granularities to exercise the rounding change behavior. The assertions properly verify histogram correctness, bucket count limits, document count totals, and temporal ordering.

1137-1178: LGTM!

Good test coverage for the interaction between the skiplist collector and sub-aggregations. Verifies that stats sub-aggregations are correctly populated even when rounding changes occur during collection.

1185-1225: LGTM!

Comprehensive test for bucket merging behavior. The dataset design (hourly data + yearly quarterly data) forces significant rounding changes, and the hardcoded expected values correctly account for how the 140 documents should be distributed across yearly buckets.

1231-1284: LGTM!

Excellent addition that addresses the previously requested equivalence test. The randomized approach with 20-200 documents and random date distribution provides good coverage. The comparison of bucket counts, keys, and doc counts ensures the skiplist path produces identical results to the non-skiplist path.
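
The equivalence idea described above can be illustrated with a toy model. This is not the OpenSearch test or collector code: the block-level shortcut only stands in for a skiplist-style path, the fixed hourly rounding and the block size of 8 are assumptions, and all names are hypothetical. It draws 20-200 random timestamps, sorts them, and checks that counting whole blocks at once gives the same bucket keys and doc counts as rounding every value individually.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class SkiplistEquivalenceSketch {
    static final long HOUR_MS = 3_600_000L;

    static long roundToHour(long millis) {
        return Math.floorDiv(millis, HOUR_MS) * HOUR_MS;
    }

    /** Reference path: round every value and count it individually. */
    static Map<Long, Integer> perDocument(long[] sortedMillis) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long value : sortedMillis) {
            counts.merge(roundToHour(value), 1, Integer::sum);
        }
        return counts;
    }

    /** Block path: if a block's min and max round to the same key, count the whole block at once. */
    static Map<Long, Integer> blockWise(long[] sortedMillis, int blockSize) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (int start = 0; start < sortedMillis.length; start += blockSize) {
            int end = Math.min(start + blockSize, sortedMillis.length);
            // Input is sorted, so the first and last values are the block's min and max.
            long minKey = roundToHour(sortedMillis[start]);
            long maxKey = roundToHour(sortedMillis[end - 1]);
            if (minKey == maxKey) {
                counts.merge(minKey, end - start, Integer::sum);
            } else {
                // Block straddles a bucket boundary: fall back to per-value counting.
                for (int i = start; i < end; i++) {
                    counts.merge(roundToHour(sortedMillis[i]), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** 20 to 200 random timestamps inside a 48-hour window, sorted ascending. */
    static long[] randomSortedDataset(Random random) {
        int size = 20 + random.nextInt(181);
        long[] values = new long[size];
        for (int i = 0; i < size; i++) {
            values[i] = (long) (random.nextDouble() * 48 * HOUR_MS);
        }
        Arrays.sort(values);
        return values;
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int round = 0; round < 100; round++) {
            long[] data = randomSortedDataset(random);
            if (!perDocument(data).equals(blockWise(data, 8))) {
                throw new AssertionError("paths diverged in round " + round);
            }
        }
        System.out.println("block-wise and per-document counts matched in all rounds");
    }
}
```

Comparing the full key-to-count maps covers bucket count, keys, and doc counts in one assertion, which mirrors the three properties the review comment lists.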

1286-1307: LGTM!

Clean helper method that encapsulates the aggregation execution logic, avoiding code duplication in testSkiplistEquivalence.

Signed-off-by: Asim Mahmood <asim.seng@gmail.com>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (2)
102-103: Consider expanding the comment to explain why @timestamp is used.

Per past review feedback, the comment could clarify that @timestamp is conventionally configured with skip_list: true in typical log/metrics index mappings, making it a suitable choice for testing skiplist behavior.
-    // @timestamp field name by default uses skip_list
+    // @timestamp field name is used because it conventionally has skip_list enabled
+    // in typical log/metrics index mappings, making it suitable for skiplist tests
     private static final String DATE_FIELD = "@timestamp";
1056-1062: Clarify the optimization being avoided and consider using Term directly.

The comment could be more specific about which approximation optimization is being avoided. Additionally, using Term directly is more idiomatic for document deletion.
-        // delete a doc to avoid approx optimization
+        // Add and delete a document to ensure segments have deletions,
+        // which disables the count approximation optimization and forces
+        // the skiplist collector path to be exercised
         if (enableSkiplist) {
             document.add(new StringField("someField", "a", Field.Store.NO));
             indexWriter.addDocument(document);
             indexWriter.commit();
-            indexWriter.deleteDocuments(new TermQuery(new Term("someField", "a")));
+            indexWriter.deleteDocuments(new Term("someField", "a"));
         }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2ff706 and 350729d.

📒 Files selected for processing (1)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (5 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (2)

plugins/arrow-flight-rpc/src/main/java/org/opensearch/arrow/flight/stats/FlightMetrics.java (1)

Histogram (409-449)

server/src/main/java/org/opensearch/index/mapper/NumberFieldMapper.java (1)

NumberFieldMapper (102-2185)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: gradle-check
GitHub Check: assemble (25, windows-latest)
GitHub Check: detect-breaking-change
GitHub Check: Analyze (java)
GitHub Check: assemble (21, ubuntu-24.04-arm)
GitHub Check: assemble (25, ubuntu-24.04-arm)
GitHub Check: assemble (25, ubuntu-latest)
GitHub Check: assemble (21, ubuntu-latest)
GitHub Check: assemble (21, windows-latest)
GitHub Check: precommit (21, windows-2025, true)
GitHub Check: precommit (21, ubuntu-latest)
GitHub Check: precommit (25, ubuntu-24.04-arm)
GitHub Check: precommit (21, macos-15-intel)
GitHub Check: precommit (21, macos-15)
GitHub Check: precommit (21, windows-latest)
GitHub Check: precommit (25, ubuntu-latest)
GitHub Check: precommit (25, macos-15-intel)
GitHub Check: precommit (25, windows-latest)
GitHub Check: precommit (25, macos-15)
GitHub Check: precommit (21, ubuntu-24.04-arm)

🔇 Additional comments (5)

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java (5)

1093-1129: LGTM!

The test correctly validates skiplist collector behavior with rounding changes. The dataset construction (24 hourly + 30 spanning months) effectively triggers rounding changes, and the assertions verify the expected outcomes.

1136-1177: LGTM!

Good coverage for verifying sub-aggregations work correctly when skiplist collector handles rounding changes. The test properly validates that stats sub-aggregations are present and have valid values for non-empty buckets.

1184-1224: LGTM!

Excellent test for bucket merging behavior. The expected counts are correctly calculated:

2020: 120 hourly docs (5 days × 24 hours) + 4 quarterly docs = 124

2021-2024: 4 quarterly docs each

The explicit expectedDocCount map provides strong validation of correct bucket merging after rounding increases.
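
The arithmetic behind those expected counts can be checked with a short sketch. The exact timestamps are assumptions made for illustration (hourly docs starting at 2020-01-01T00:00Z, quarterly docs on the first day of January, April, July, and October); only the totals come from the comment above. The class and method names are hypothetical.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class YearlyBucketCounts {
    /** Builds the assumed dataset: 5 days of hourly timestamps plus one per quarter for 2020-2024. */
    static List<ZonedDateTime> buildDataset() {
        List<ZonedDateTime> docs = new ArrayList<>();
        ZonedDateTime start = ZonedDateTime.of(2020, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC);
        for (int hour = 0; hour < 5 * 24; hour++) {
            docs.add(start.plusHours(hour)); // 120 hourly docs, all in 2020
        }
        for (int year = 2020; year <= 2024; year++) {
            for (int month = 1; month <= 12; month += 3) {
                docs.add(ZonedDateTime.of(year, month, 1, 0, 0, 0, 0, ZoneOffset.UTC)); // 4 docs per year
            }
        }
        return docs;
    }

    /** Groups timestamps into yearly buckets, as the final rounding would after all merges. */
    static Map<Integer, Integer> countByYear(List<ZonedDateTime> docs) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (ZonedDateTime doc : docs) {
            counts.merge(doc.getYear(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<ZonedDateTime> docs = buildDataset();
        System.out.println(docs.size());       // prints 140
        System.out.println(countByYear(docs)); // prints {2020=124, 2021=4, 2022=4, 2023=4, 2024=4}
    }
}
```

The totals line up with the 140 documents mentioned in the earlier review round: 120 + 4 in 2020, then 4 in each of the four following years.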

1230-1283: LGTM - addresses past review feedback!

This test directly addresses the reviewer request for comparing skiplist vs non-skiplist results on random data. The randomized approach with sorted dataset ensures thorough coverage while maintaining deterministic comparison between the two paths.

1285-1306: LGTM!

Clean helper method that properly manages resources and follows the established patterns in the test class.

github-actions · 2025-12-06T00:08:41Z

❌ Gradle check result for 350729d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-12-06T02:31:27Z

✅ Gradle check result for 350729d: SUCCESS

...in/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java

Signed-off-by: Asim Mahmood <asim.seng@gmail.com>

github-actions · 2025-12-08T19:53:45Z

✅ Gradle check result for 9d011ca: SUCCESS

Signed-off-by: Asim Mahmood <asim.seng@gmail.com> (cherry picked from commit dfef2c1) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…#20191) (cherry picked from commit dfef2c1) Signed-off-by: Asim Mahmood <asim.seng@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

github-actions bot added the Search:Aggregations, Search:Performance, and v3.4.0 labels Nov 19, 2025

asimmahmood1 self-assigned this Nov 19, 2025

asimmahmood1 added this to Performance Roadmap Nov 19, 2025

github-project-automation bot moved this to Todo in Performance Roadmap Nov 19, 2025

asimmahmood1 requested a review from jainankitk November 19, 2025 18:41

asimmahmood1 closed this Nov 24, 2025

github-project-automation bot moved this from Todo to Done in Performance Roadmap Nov 24, 2025

asimmahmood1 reopened this Nov 24, 2025

github-project-automation bot moved this from Done to In Progress in Performance Roadmap Nov 24, 2025

This was referenced Nov 24, 2025

Request to approve/deny benchmark run for PR #20057 #20091

Closed

Request to approve/deny benchmark run for PR #20057 #20092

Closed

opensearch-ci-bot mentioned this pull request Nov 24, 2025

[AUTOCUT] Gradle Check Flaky Test Report for WarmIndexSegmentReplicationIT #18157

Open

opensearch-ci-bot mentioned this pull request Nov 25, 2025

[AUTOCUT] Gradle Check Flaky Test Report for MetadataIndexTemplateServiceTests #19058

Open

jainankitk mentioned this pull request Nov 25, 2025

Combining filter rewrite and skip list to optimize sub aggregation #19573

Merged

1 task

asimmahmood1 added 2 commits December 5, 2025 15:15

Add equivalence test case with random values

6623782

Signed-off-by: Asim Mahmood <asim.seng@gmail.com>

Merge remote-tracking branch 'upstream/main' into auto-date-skiplist

f2ff706

coderabbitai bot reviewed Dec 5, 2025

Remove unused field

350729d

Signed-off-by: Asim Mahmood <asim.seng@gmail.com>

coderabbitai bot reviewed Dec 5, 2025

asimmahmood1 closed this Dec 6, 2025

github-project-automation bot moved this from In Progress to Done in Performance Roadmap Dec 6, 2025

asimmahmood1 reopened this Dec 6, 2025

github-project-automation bot moved this from Done to In Progress in Performance Roadmap Dec 6, 2025

prudhvigodithi added the backport 3.4 label Dec 8, 2025

jainankitk approved these changes Dec 8, 2025

...in/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java (outdated, resolved)

Removed unnecessary code

9d011ca

Signed-off-by: Asim Mahmood <asim.seng@gmail.com>

asimmahmood1 closed this Dec 8, 2025

github-project-automation bot moved this from In Progress to Done in Performance Roadmap Dec 8, 2025

asimmahmood1 reopened this Dec 8, 2025

github-project-automation bot moved this from Done to In Progress in Performance Roadmap Dec 8, 2025

jainankitk approved these changes Dec 8, 2025

jainankitk merged commit dfef2c1 into opensearch-project:main Dec 8, 2025
56 of 64 checks passed

github-project-automation bot moved this from In Progress to Done in Performance Roadmap Dec 8, 2025

opensearch-trigger-bot bot mentioned this pull request Dec 8, 2025

[Backport 3.4] Add skiplist optimization to auto_date_histogram aggregation #20189

Closed

asimmahmood1 deleted the auto-date-skiplist branch December 8, 2025 20:12

prudhvigodithi added and removed the backport 3.4 label Dec 8, 2025

opensearch-trigger-bot bot mentioned this pull request Dec 8, 2025

[Backport 3.4] Add skiplist optimization to auto_date_histogram aggregation #20191

Merged

asimmahmood1 commented Nov 24, 2025

range auto date: from 1335 to 139 (89%)

asimmahmood1 commented Nov 24, 2025

range-auto-date-with-metrics (22% lower)

opensearch-ci-bot commented Nov 25, 2025

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5174/

opensearch-ci-bot commented Nov 25, 2025

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/210/

5 participants
